LONG-TERM GOALS

The long-term objective of our research for the “High Resolution Air-Sea Interaction” (HRES)
Departmental Research Initiative (DRI) is to identify the couplings between large wave events,
winds, and currents in the surface layer of the marine boundary layers. Turbulence-resolving
large eddy simulations (LESs) and direct numerical simulations (DNSs) of the marine atmospheric
boundary layer (MABL) in the presence of time- and space-varying wave fields will be
the main tools used to elucidate wind-wave-current interactions. A suite of turbulence simulations
over realistic seas using idealized and observed pressure gradients will be carried out to
complement the field observations collected in moderate to high winds. The database of simulations
will be used to generate statistical moments, interrogated for coherent structures, and
ultimately compared with HRES observations.


OBJECTIVES 

Our near-term goal is to participate in the planning of the HRES research initiative. This includes
developing a science plan, outlining future field campaigns, and identifying opportunities
for turbulence modeling studies. Also, during the planning phase of the DRI we intend to improve
the parallelization of our base LES code in order to take full advantage of the modeling
enhancements that will be developed as HRES evolves.


APPROACH 

We plan on investigating interactions among the MABL, the ocean boundary layer (OBL), and
the connecting air-sea interface using both LES and DNS. The waves will be externally imposed:
(1) based on well-established empirical wave spectra; or (2) ultimately provided by direct observations
of the sea surface from field campaigns. The main technical advance is the development
of a computational tool that allows for nearly arbitrary 3-D wave fields, i.e., the sea surface elevation
$\eta = \eta(x, y, t)$ as a surface boundary condition. The computational method will allow time-
and space-varying surface conditions over a range of wave scales $O(10)$ m or larger.


WORK COMPLETED


The HRES initiative is just underway and hence the work reported here falls in the category of
a new start. First, in the past year we participated in the process that generated the HRES science
plan; it is available from the lead authors, Professors K. Melville, C. Friehe, and D. Yue.
The plan emphasizes both observations and modeling of the wave and wind fields and proposes
a pilot experiment in Fall 2008 to test new instrumentation, with the main experiment scheduled
for Spring 2010 off the California coast. A variety of measuring platforms, viz., vessels, R/P
FLIP, aircraft, and buoys, will be utilized in the main experiment.

In anticipation of the high-resolution, computationally intensive turbulence modeling needed for
HRES, we revisited our suite of simulation codes with the goals of improving the MPI parallelization
(Message Passing Interface, Aoyama and Nakano, 1999), making the codes compliant
with Fortran-90 programming practice, and adding MPI I/O (Gropp et al., 1998) as the primary
means of transferring data to and from disk files. In essence, the flat-bottom LES code was completely
rewritten with the above constraints in mind. E. Patton at NCAR contributed heavily
to these developments. This new code will form the baseline software, which will be further modified
to include the time-evolving wavy lower boundary described in Sullivan et al. (2007).

Our previous codes use a single domain decomposition procedure that splits MPI tasks across
the z direction. Work is further partitioned in x-y planes using a complicated mix of threaded
OpenMP (OMP) directives (Chandra et al., 2001). Also, a global (elliptic) problem for pressure is solved
as required for an incompressible flow. This scheme is advantageous since it does not split Fast
Fourier Transforms (FFTs) across spatial directions and can utilize the architecture of machines
with large numbers of CPUs per computational node (e.g., the IBM SP5 with 16 CPUs/node).
However, the scheme falls short on other computing platforms which have few CPUs/node (e.g.,
the Cray XT3 with 2 CPUs/node), and moreover the OMP directives require continual maintenance
that adds to the code complexity. To overcome these deficiencies a new algorithm was
designed based on the criteria: (1) allow arbitrary 2-D domain decomposition using solely MPI
parallelization; (2) preserve pseudospectral (FFT) differencing in x-y planes; and (3) maintain
a Boussinesq incompressible flow model.

In the new scheme, each processor performs its operations on constricted three-dimensional
bricks with the y and z directions truncated as shown in figure 1. In order to preserve
pseudospectral differencing in the horizontal directions a custom MPI matrix transpose was designed
and implemented. The routine performs the forward and inverse operations

$$f(x, y_s:y_e, z_s:z_e) \Longleftrightarrow f^T(y, x_s:x_e, z_s:z_e) \qquad (1)$$

on the field $f$ using a subset of horizontal processors as shown in figure 1. In (1), subscripts
$(\ )_{s,e}$ denote starting and ending locations in the (x, y, z) directions. Note this transpose only
requires local communication between processors in groups [0-2], [3-5], and [6-8]. Derivatives
$\partial f/\partial y$, which are needed in physical space, are then computed in a straightforward fashion
using the sequence of steps: forward x to y transpose $f \rightarrow f^T$, FFT derivative $\partial f^T/\partial y$, inverse y
to x transpose $\partial f^T/\partial y \rightarrow \partial f/\partial y$. An existing serial 1-D FFT is used as in our previous codes.
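
The transpose, differentiate, transpose-back sequence can be illustrated with a small serial emulation. This is a minimal sketch and not the Fortran/MPI routine itself. It models one group of processors sharing the same z range (so z is omitted), replaces MPI messages with list copies, and uses a naive DFT in place of the serial FFT. All function names (`split_ranges`, `transpose`, `dfdy`, `make_bricks`) are hypothetical.

```python
import cmath
import math


def split_ranges(n, parts):
    """Split range(n) into `parts` contiguous (start, end) pairs, end exclusive."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def transpose(blocks, n_full, n_split, nproc):
    """Swap which horizontal direction is complete on each processor.

    blocks[p][s_local][f]: processor p owns a range of the split direction
    and every index f of the full direction. The result has the roles swapped.
    """
    split_r = split_ranges(n_split, nproc)
    out = []
    for fs, fe in split_ranges(n_full, nproc):   # receiving processor
        slab = [[0.0] * n_split for _ in range(fe - fs)]
        for p, (ss, se) in enumerate(split_r):   # one "message" per sender
            for s in range(ss, se):
                for f in range(fs, fe):
                    slab[f - fs][s] = blocks[p][s - ss][f]
        out.append(slab)
    return out


def dft(values, sign):
    """Naive O(n^2) discrete Fourier transform (stand-in for the serial FFT)."""
    n = len(values)
    return [sum(v * cmath.exp(sign * 2j * math.pi * k * j / n)
                for j, v in enumerate(values)) for k in range(n)]


def spectral_derivative(line, length):
    """Derivative of a periodic real sequence sampled on [0, length)."""
    n = len(line)
    coeffs = dft(line, -1)
    for k in range(n):
        m = k if k < n / 2 else k - n            # signed wavenumber index
        if n % 2 == 0 and k == n // 2:
            m = 0                                # drop the Nyquist mode
        coeffs[k] *= 1j * 2.0 * math.pi * m / length
    return [v.real / n for v in dft(coeffs, +1)]


def dfdy(bricks, nx, ny, ly, nproc):
    """f -> f^T, differentiate along y, then transpose back to the brick layout."""
    slabs = transpose(bricks, nx, ny, nproc)
    dslabs = [[spectral_derivative(line, ly) for line in slab] for slab in slabs]
    return transpose(dslabs, ny, nx, nproc)


def make_bricks(nx, ny, nproc):
    """Test field f(x_i, y_j) = (1 + i) * sin(2*pi*j/ny), stored as y-bricks."""
    return [[[(1 + i) * math.sin(2.0 * math.pi * j / ny) for i in range(nx)]
             for j in range(ys, ye)] for ys, ye in split_ranges(ny, nproc)]
```

For example, `dfdy(make_bricks(6, 8, 3), 6, 8, 2 * math.pi, 3)` returns the derivative in the same y-brick layout as the input. Each receiving processor in the group exchanges only a small block with each of its partners, which is why no global communication is needed.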

The brick decomposition of the computational domain also impacts the pressure Poisson equation
solver. In an incompressible Boussinesq fluid model the pressure $p$ is a solution of the elliptic
equation

$$\nabla^2 p = r, \qquad (2)$$


where the source term $r$ is the numerical divergence of the unsteady momentum equations (e.g.,
see Sullivan et al. 1996). The solution begins with a standard forward 2-D Fourier transform of
(2):

$$-(k_x^2 + k_y^2)\,\hat{p} + \frac{\partial^2 \hat{p}}{\partial z^2} = \hat{r}(k_y, k_x, z) \quad \text{with} \quad k_{xs} \le k_x \le k_{xe}, \; z_s \le z \le z_e \qquad (3)$$


where $(k_x, k_y)$ are horizontal wavenumbers. At this stage the data layout on each processor is
as shown in the upper right panel of figure 1. Custom routines carry out forward $k_y$ to z and
inverse z to $k_y$ MPI matrix transposes on the source term of the pressure Poisson equation:

$$\hat{r}(k_y, k_x, z) \Longleftrightarrow \hat{r}^T(z, k_x, k_y) \quad \text{with} \quad k_{xs} \le k_x \le k_{xe}, \; k_{ys} \le k_y \le k_{ye}. \qquad (4)$$

The storage of $\hat{r}^T$ allows straightforward tridiagonal matrix inversion for pairs of horizontal
wavenumbers on each processor; this yields the transposed field $\hat{p}^T(z, k_{xs}:k_{xe}, k_{ys}:k_{ye})$. To
recover the pressure field in physical space we retrace our steps: $\hat{p}^T \rightarrow \hat{p}$ followed by an inverse
2-D Fourier transform $\hat{p} \rightarrow p$.
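
The tridiagonal inversion for each wavenumber pair can be sketched as follows. This is an illustration under stated assumptions and not the solver in the LES code. It assumes a uniform vertical grid, second-order centered differences for the z derivative, and homogeneous Dirichlet conditions (transformed pressure equal to zero) at the top and bottom, none of which is specified in the text above. The names `thomas`, `solve_pressure_column`, and `solve_pressure_transposed` are hypothetical.

```python
import math


def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for j in range(1, n):                        # forward elimination
        denom = diag[j] - lower[j] * c[j - 1]
        c[j] = upper[j] / denom
        d[j] = (rhs[j] - lower[j] * d[j - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for j in range(n - 2, -1, -1):               # back substitution
        x[j] = d[j] - c[j] * x[j + 1]
    return x


def solve_pressure_column(r_hat, k2, dz):
    """Solve d2p/dz2 - k2 * p = r_hat on interior points of a uniform grid.

    Assumes p = 0 just outside both ends of the column (illustrative
    boundary condition only). k2 is kx**2 + ky**2 for this column.
    """
    n = len(r_hat)
    off = 1.0 / dz ** 2
    return thomas([off] * n, [-2.0 * off - k2] * n, [off] * n, r_hat)


def solve_pressure_transposed(r_hat_t, kx, ky, dz):
    """Solve every z column held by one processor.

    r_hat_t[a][b] is the full z column for the wavenumber pair (kx[a], ky[b]),
    so each solve needs no data from other processors.
    """
    return [[solve_pressure_column(col, kxa ** 2 + kyb ** 2, dz)
             for col, kyb in zip(row, ky)]
            for row, kxa in zip(r_hat_t, kx)]
```

Because the z direction is complete on each processor after the transpose, the loop over wavenumber pairs is embarrassingly parallel. The source term may be complex in practice, and the same routines work unchanged with Python complex numbers.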


With these improvements the re-designed algorithm allows very large numbers of processors,
$O(10^3)$, to be utilized. An important feature of the algorithm is that no global (MPI_ALLTOALL)
communication between processors is required. We have introduced more communication, but
the messages are smaller and hence large numbers of gridpoints can be used. Also, the algorithm
permits the number of CPUs to exceed the number of gridpoints in the vertical direction,
allowing turbulent flows in large horizontal domains to be simulated.
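
The index bookkeeping behind a 2-D brick decomposition can be sketched in a few lines. This is a hedged illustration and not the code's actual layout routine. It assumes ranks are numbered with the y position varying fastest, so that ranks sharing a z range are consecutive, matching the groups [0-2], [3-5], and [6-8] on nine processors. It uses 0-based, end-exclusive index ranges, and the names `split_ranges`, `brick_layout`, and `transpose_group` are hypothetical.

```python
def split_ranges(n, parts):
    """Split range(n) into `parts` contiguous (start, end) pairs, end exclusive."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def brick_layout(ny, nz, ncpu_y, ncpu_z):
    """Map each rank to its brick bounds (ys, ye, zs, ze); x is not split."""
    layout = {}
    for iz, (zs, ze) in enumerate(split_ranges(nz, ncpu_z)):
        for iy, (ys, ye) in enumerate(split_ranges(ny, ncpu_y)):
            layout[iz * ncpu_y + iy] = (ys, ye, zs, ze)
    return layout


def transpose_group(rank, ncpu_y):
    """Ranks that share this rank's z range, i.e. its x <-> y transpose partners."""
    first = (rank // ncpu_y) * ncpu_y
    return list(range(first, first + ncpu_y))


if __name__ == "__main__":
    # Nine processors arranged 3 x 3, as in the sketch of figure 1.
    print(transpose_group(4, 3))
    # Hypothetical case: 128 vertical levels on 512 ranks (8 in y, 64 in z),
    # so the rank count exceeds the number of vertical gridpoints.
    layout = brick_layout(ny=1024, nz=128, ncpu_y=8, ncpu_z=64)
    print(len(layout), layout[0])
```

With a 1-D decomposition in z the rank count could not exceed the number of vertical levels, whereas here the extra ranks are absorbed by splitting y as well.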


RESULTS 

In order to test the new MPI algorithm outlined above we simulated convection-dominated atmospheric
boundary layers (e.g., Moeng 1984; Sullivan et al. 1998) using different meshes and
brick decompositions. An illustrative example is shown in figure 2 where the mesh is (1000 x
1000 x 128) gridpoints and the number of CPUs utilized is 128. Very large problems with $2048^3$
meshes have also been run on 8192 CPUs of a Cray XT4. This new code will become the baseline
code for the time-evolving wavy boundary computations in HRES.


IMPACT/APPLICATIONS

The computational tool to be developed and the database of solutions that will be generated
will aid in the interpretation of the observations gathered during the field campaigns of HRES.
In addition, idealized process studies performed with the simulations have the potential to improve
parameterizations of surface drag under high-wind conditions in large-scale models.


TRANSITIONS & RELATED PROJECTS

We are currently engaged in analyzing data collected during the Ocean Horizontal Array Turbulence
Study (OHATS) and the Coupled Boundary Layers Air-Sea Transfer (CBLAST) field
campaigns. These are joint efforts between NCAR and numerous university investigators. Also,
the present work has links to the future atmosphere/ocean typhoon initiatives planned by ONR.

Figure 1: Sketch of the MPI domain decomposition
and matrix transposes used in the new incompressible
Boussinesq LES code. The spatial differencing is
pseudospectral in the x and y directions and finite
difference in the vertical z direction. Upper left panel
shows the base decomposition of the total domain into
constricted horizontal bricks on nine processors
[0-8]; upper right panel illustrates the data structure
on each processor after an x to y matrix transpose
used to compute y-derivatives using a standard FFT;
and the lower left panel shows the data structure on
each processor used in the tridiagonal matrix
inversion of the pressure solver.


[Figure 2 image; horizontal axis: x (km)]


Figure 2: Visualization of the potential temperature field in an x-y plane at z
~ 100 m from an LES of an atmospheric boundary layer driven by strong
convection. The computational domain is near mesoscale (50 x 50 x 3) km
with boundary-layer resolution (1000 x 1000 x 128) gridpoints. The
computations are done on an IBM SP5 with processor count equal to
128 CPUs. Test runs have also been carried out with meshes of $2048^3$
gridpoints utilizing 8192 CPUs on a Cray XT4.


